
Conversation

@astroshim
Contributor

What is this PR for?

This PR adds documentation for running Zeppelin in production environments, especially Spark on YARN.
The related issue is #1227, and I got a lot of hints from https://github.com/sequenceiq/hadoop-docker.
Tested on Ubuntu.

What type of PR is it?

Documentation

What is the Jira issue?

https://issues.apache.org/jira/browse/ZEPPELIN-1280

Questions:

  • Do the license files need an update? No.
  • Are there breaking changes for older versions? No.
  • Does this need documentation? No.



```
ps -ef
```
Contributor

You mean `ps -ef | grep spark`?

Contributor Author
@astroshim Aug 11, 2016

Hadoop is also running, so isn't plain `ps -ef` the best way?

Contributor
@AhyoungRyu Aug 11, 2016

Ah, right. But I just wanted to filter the process list.
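A possible compromise, as a sketch (assuming the doc only needs to show the Spark and Hadoop daemons rather than the full process table):

```
# Show only Spark- and Hadoop-related processes instead of the whole table
ps -ef | grep -E 'spark|hadoop'
```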

@AhyoungRyu
Contributor

AhyoungRyu commented Aug 11, 2016

@astroshim Great work indeed! While proofreading spark_cluster_mode.md, I updated a few minor things here. Could you check this one, please?

@astroshim
Contributor Author

astroshim commented Aug 11, 2016

@AhyoungRyu Thank you very much for your effort. 👍

Minor update for spark_cluster_mode.md
```
<li class="title"><span><b>Advanced</b><span></li>
<li><a href="{{BASE_PATH}}/install/virtual_machine.html">Zeppelin on Vagrant VM</a></li>
<li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Standalone)</a></li>
<li><a href="{{BASE_PATH}}/install/spark_cluster_mode.html#spark-standalone-mode">Zeppelin on Spark Cluster Mode (Yarn)</a></li>
```
Member

It would probably be a good idea to all-cap YARN:
http://spark.apache.org/docs/latest/running-on-yarn.html

@astroshim
Contributor Author

astroshim commented Aug 11, 2016

@felixcheung Thank you very much for the detailed review. 👍
I'll fix them, but I wonder which Spark and Hadoop versions this document should cover.

@felixcheung
Member

It's hard to say. I think one approach would be the latest (Spark 2.0 & Hadoop 2.7); another approach would be the most popular ones.

@astroshim
Contributor Author

astroshim commented Aug 11, 2016

Do you know which Spark & Hadoop versions are popular? I can test them.

@astroshim
Contributor Author

Spark 2.0 & Hadoop 2.3 are working well.
(screenshot)

@felixcheung
Member

Cool. Hadoop versions in distributions:

  • CDH: 2.6.0
  • HDP/Azure: 2.7.1
  • EMR: 2.7.2
  • GCP Dataproc: 2.7.2

@astroshim
Contributor Author

Then what about supporting the latest versions (Spark 2.0 & Hadoop 2.7)?

@bzz
Member

bzz commented Aug 12, 2016

The docs look great to me, thank you @astroshim!

@astroshim
Contributor Author

Spark 2.0 & Hadoop 2.7 are working well.

  • YARN (screenshot)
  • HDFS (screenshot)

As you can see, the test spark-submit job was successful, but the Zeppelin job doesn't work properly.
I'll take a look at the problem later.

@astroshim
Contributor Author

astroshim commented Aug 16, 2016

I got the following error when I tried to run Zeppelin with Spark 2.0 & Hadoop 2.7.

```
ERROR [2016-08-16 16:43:33,121] ({pool-1-thread-3} Utils.java[invokeMethod]:40) -
java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:38)
        at org.apache.zeppelin.spark.Utils.invokeMethod(Utils.java:33)
        at org.apache.zeppelin.spark.SparkInterpreter.createSparkSession(SparkInterpreter.java:345)
        at org.apache.zeppelin.spark.SparkInterpreter.getSparkSession(SparkInterpreter.java:218)
        at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:743)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
        at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getProgress(LazyOpenInterpreter.java:110)
        at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer.getProgress(RemoteInterpreterServer.java:447)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1701)
        at org.apache.zeppelin.interpreter.thrift.RemoteInterpreterService$Processor$getProgress.getResult(RemoteInterpreterService.java:1686)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.hadoop.yarn.conf.YarnConfiguration
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.newConfiguration(YarnSparkHadoopUtil.scala:71)
        at org.apache.spark.deploy.SparkHadoopUtil.<init>(SparkHadoopUtil.scala:54)
        at org.apache.spark.deploy.yarn.YarnSparkHadoopUtil.<init>(YarnSparkHadoopUtil.scala:56)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at java.lang.Class.newInstance(Class.java:383)
        at org.apache.spark.deploy.SparkHadoopUtil$.liftedTree1$1(SparkHadoopUtil.scala:414)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn$lzycompute(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.yarn(SparkHadoopUtil.scala:412)
        at org.apache.spark.deploy.SparkHadoopUtil$.get(SparkHadoopUtil.scala:437)
        at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:2203)
        at org.apache.spark.storage.BlockManager.<init>(BlockManager.scala:104)
        at org.apache.spark.SparkEnv$.create(SparkEnv.scala:320)
        at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:165)
        at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:259)
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:423)
        at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2256)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:831)
        at org.apache.spark.sql.SparkSession$Builder$$anonfun$8.apply(SparkSession.scala:823)
        at scala.Option.getOrElse(Option.scala:121)
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:823)
        ... 20 more
```

My build command is

```
mvn clean package -Pspark-2.0 -Phadoop-2.7 -Dhadoop.version=2.7.2 -Pyarn -Ppyspark -Pscala-2.11 -DskipTests
```

but the Hadoop libraries for the Spark interpreter are

```
~/zeppelin$ ls -al ./spark/target/lib/hadoop-*
-rw-rw-r-- 1 hsshim hsshim   17385  8월 16 23:52 ./spark/target/lib/hadoop-annotations-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   49750  8월 16 23:52 ./spark/target/lib/hadoop-auth-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim    2559  8월 16 23:52 ./spark/target/lib/hadoop-client-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 2735584  8월 16 23:52 ./spark/target/lib/hadoop-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 5242252  8월 16 23:52 ./spark/target/lib/hadoop-hdfs-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim  482042  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-app-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim  656365  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 1455001  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-core-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim   35216  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-jobclient-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim   21537  8월 16 23:52 ./spark/target/lib/hadoop-mapreduce-client-shuffle-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 2015575  8월 16 23:52 ./spark/target/lib/hadoop-yarn-api-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   94728  8월 16 23:52 ./spark/target/lib/hadoop-yarn-client-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim 1301627  8월 16 23:52 ./spark/target/lib/hadoop-yarn-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim  175554  8월 16 23:52 ./spark/target/lib/hadoop-yarn-server-common-2.2.0.jar
-rw-rw-r-- 1 hsshim hsshim   25710  8월 16 23:52 ./spark/target/lib/hadoop-yarn-server-web-proxy-2.2.0.jar
```

Maybe the error occurs because of mismatched versions of the Hadoop libraries.
I will make a PR for this.
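For reference, a sketch of one way to trace where the mixed Hadoop versions come from; `dependency:tree` is a standard Maven goal, and `-pl spark` is assumed here to select the Spark interpreter module:

```
# List every org.apache.hadoop artifact on the spark module's dependency tree,
# together with the dependency that pulls it in
mvn dependency:tree -Dincludes=org.apache.hadoop -pl spark
```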

@felixcheung
Member

Looks like some of the Hadoop jars are 2.2 instead of 2.7?

@astroshim
Contributor Author

@felixcheung Yes, 2.2. Maybe it's because my Maven repo has different versions of the Hadoop libraries, like the following.

```
~/zeppelin$ ls -al ~/.m2/repository/org/apache/hadoop/hadoop-common/
total 36
drwxrwxr-x  9 hsshim hsshim 4096  8월 16 17:03 .
drwxrwxr-x 25 hsshim hsshim 4096  8월 17 00:22 ..
drwxrwxr-x  2 hsshim hsshim 4096  8월 16 17:04 2.2.0
drwxrwxr-x  2 hsshim hsshim 4096  6월 22 12:04 2.3.0
drwxrwxr-x  2 hsshim hsshim 4096  6월 22 00:07 2.4.0
drwxrwxr-x  2 hsshim hsshim 4096  6월 22 00:14 2.5.1
drwxrwxr-x  2 hsshim hsshim 4096  8월  4 19:54 2.6.0
drwxrwxr-x  2 hsshim hsshim 4096  8월 16 14:50 2.7.0
drwxrwxr-x  2 hsshim hsshim 4096  7월 14 16:58 2.7.2
```

So I made a fix for this in #1335.

@astroshim
Contributor Author

The Zeppelin job succeeded with Spark 2.0 & Hadoop 2.7 after applying #1335.

  • Hadoop libraries:
```
~/zeppelin$ ls -al ./spark/target/lib/hadoop-*
-rw-rw-r-- 1 hsshim hsshim   17385  8월 17 00:51 ./spark/target/lib/hadoop-annotations-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   70685  8월 17 00:51 ./spark/target/lib/hadoop-auth-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim    2545  8월 17 00:51 ./spark/target/lib/hadoop-client-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 3443040  8월 17 00:51 ./spark/target/lib/hadoop-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 8268375  8월 17 00:51 ./spark/target/lib/hadoop-hdfs-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  516614  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-app-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  753123  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 1531485  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-core-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   38213  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-jobclient-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   48268  8월 17 00:51 ./spark/target/lib/hadoop-mapreduce-client-shuffle-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 2015575  8월 17 00:51 ./spark/target/lib/hadoop-yarn-api-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  142639  8월 17 00:51 ./spark/target/lib/hadoop-yarn-client-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim 1653294  8월 17 00:51 ./spark/target/lib/hadoop-yarn-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim  364376  8월 17 00:51 ./spark/target/lib/hadoop-yarn-server-common-2.7.2.jar
-rw-rw-r-- 1 hsshim hsshim   34953  8월 17 00:51 ./spark/target/lib/hadoop-yarn-server-web-proxy-2.7.2.jar
```
  • Zeppelin screen (screenshot)
  • YARN applications screen (screenshot)

@zjffdu
Contributor

zjffdu commented Aug 16, 2016

@astroshim I can use Spark 2.0 and Hadoop 2.7 successfully. I hit this issue when building Zeppelin with the yarn profile enabled. So please don't enable the yarn profile; otherwise you will get a Hadoop version mismatch. I have left a comment in #1301 to remove the yarn profile.
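For clarity, a sketch of the resulting build command, i.e. the command posted above with only `-Pyarn` removed:

```
mvn clean package -Pspark-2.0 -Phadoop-2.7 -Dhadoop.version=2.7.2 -Ppyspark -Pscala-2.11 -DskipTests
```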

@astroshim
Contributor Author

@zjffdu I just tested it and it succeeded after removing the yarn profile.
I will close #1335 then.
Thank you.

@astroshim
Contributor Author

Please merge this if there is no more discussion, because I want to write the document for https://issues.apache.org/jira/browse/ZEPPELIN-1279.

@zjffdu
Contributor

zjffdu commented Aug 18, 2016

The JIRA title seems a little confusing to me. The PR is about running Spark on YARN via Docker, but I don't think users will use Docker for production for now.

@astroshim
Contributor Author

@zjffdu You're right; users usually don't build their production environments with Docker.
But we have received many questions about running Zeppelin in production environments, so I think this PR gives users hints for solving those problems and building their production environments too.
Does this make sense?

@zjffdu
Contributor

zjffdu commented Aug 18, 2016

It would be better to change the title to reflect Docker. I think we should mention that Docker is only for small experimental environments rather than production environments. Besides that, I don't know how complicated using Docker is; I would be more conservative about bringing in extra dependencies, especially when they are complicated and not usually needed in real environments. We can hear more feedback from people who know more about Docker.

@astroshim
Contributor Author

I can update the PR title, but this PR is part of https://issues.apache.org/jira/browse/ZEPPELIN-1198, and https://issues.apache.org/jira/browse/ZEPPELIN-1278 is already merged.
If you build this PR, you can see the title in the docs as `Zeppelin on Spark Cluster Mode (YARN via Docker)`.

@zjffdu
Contributor

zjffdu commented Aug 18, 2016

Thanks @astroshim, I have no other concerns.

@astroshim astroshim changed the title [ZEPPELIN-1280][Spark on Yarn] Documents for running zeppelin on production environments. [ZEPPELIN-1280][Spark on Yarn] Documents for running zeppelin on production environments using docker. Aug 18, 2016
@felixcheung
Member

I agree we could be more specific on the title/subject for this document.

But lots of companies run production on Docker, just FYI: either Docker by itself on premises, or in the cloud with something like DC/OS.

@AhyoungRyu
Contributor

AhyoungRyu commented Aug 29, 2016

Can this be merged now? :)

@bzz
Member

bzz commented Aug 29, 2016

Looks great to me.

Merging to master if there is no further discussion.

@asfgit asfgit closed this in eccfe00 Aug 29, 2016
asfgit pushed a commit that referenced this pull request Sep 3, 2016
### What is this PR for?
This PR adds documentation for running Zeppelin in production environments, especially Spark on Mesos via Docker.
The related issues are #1227 and #1318, and I got a lot of hints from https://github.com/sequenceiq/hadoop-docker.
Tested on Ubuntu.

### What type of PR is it?
Documentation

### What is the Jira issue?
https://issues.apache.org/jira/browse/ZEPPELIN-1279

### How should this be tested?
You can refer to https://github.com/apache/zeppelin/blob/master/docs/README.md#build-documentation.
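A minimal sketch of what that amounts to, assuming the Jekyll setup described in docs/README.md:

```
# Build and serve the documentation site locally (run from the zeppelin repo root)
cd docs
bundle install                      # install Jekyll and the other gems from the Gemfile
bundle exec jekyll serve --watch    # then browse http://localhost:4000
```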

### Questions:
* Do the license files need an update? No.
* Are there breaking changes for older versions? No.
* Does this need documentation? No.

Author: astroshim <[email protected]>
Author: AhyoungRyu <[email protected]>
Author: HyungSung <[email protected]>

Closes #1389 from astroshim/ZEPPELIN-1279 and squashes the following commits:

974366a [HyungSung] Merge pull request #10 from AhyoungRyu/ZEPPELIN-1279-ahyoung
076fdba [AhyoungRyu] Change zeppelin_mesos_conf.png file
1cbe9d3 [astroshim] fix spark version and mesos
2b821b4 [astroshim] fix docs
159bafc [astroshim] fix anchor
d8c43b4 [astroshim] add navigation
c808350 [astroshim] add image file and doc
a3b0ded [astroshim] create dockerfile for mesos